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Abstract 

Slow feature analysis (SFA) is a new technique for extracting slowly varying features from a quickly 
varying signal. It is shown here that SFA can be applied to nonstationary time series to estimate a single 
underlying driving force with high accuracy up to a constant offset and a factor. Examples with a tent 
map and a logistic map illustrate the performance. 

1 Introduction 

Nonlinear time series analysis generally assumes stationarity fsee lCASDAGLi Il997t ISchreibefJ . Il999l for an 
overview). However, many time series are actually nonstationary for various reasons, such as temperature 
drift in the experimental setup, decreasing reservoirs in (bio)chemical reactors or ecological systems, global 
warming in climate data, varying heart rate in cardiology, or vibrato/tremor in vocal fold vibrations. Such 
nonstationarities can be modeled by underlying parameters, referred to as driving forces, that change the 
dynamics of the system smoothly on a slow time scale or abruptly but rarely, e.g. if the dynamics switches 
between different discrete states. 

If a test reveals that a time series is nonstationary (jWiTT et AL.l Il998|) . one can still apply meth- 
ods of stationary time series analysis if one determines sections where the driving force has similar val- 
ues and analyzes these sections as one stationary time series. This can be done by first slicing the time 
series in to windows of equa l size a nd then grouping the windows based on similarity measures of the dy- 
namics ( Manuca fc: SavitI I1996(1 . This division of the time series can be avoided by the technique of 



overembedding. If the embedding dimension is sufficiently high, then similar embedding vectors automati- 
cally belong to similar values of the driving f orces and al l avail able data can be used for analysis, such as 
forecasting or nonlinear noise reduction (jHEGGER et AL.l l200Cl() . 

However, in some cases, e.g. in the analysis of EEG data, one is particularly interested in revealing 
the driving forces themselves, which is in principle only possible u p to an invertible trans formation. One 
standard method for visualizing driving forces is the recurrence plot ijECKMANN et AlI 1X987^1 . but it is often 
difficult to interpret. Methods have been developed to estimate driving forces based on the find ing that the 
recurrence plot of a time series is similar to the recurrence plot of its underlying driving forces ijCASDAGLi 
Il997j : Schreibe rI Il999l). Another technique for the reconstruction of driving forces has been presented 



m l)VERDEs'^T'AL.ll20oi|) . Here I present an alternative approach based on slow feature analysis, a new 
technique developed in the field of theoretical neurobiology. 



2 Slow Feature Analysis 

Slow feature analysis (SFA) has been originally developed in context of an abs tract model of unsupervised 
lea rning of invariances in the visu al system of vertebrates (jWiSKOTtl Il998|) and is described in detail 
in l|WlSKOTT &: Sejnowsk1|200^ . 
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The general objective of SFA is to extract slowly varying features from a quickly varying signal. For a 
scalar output signal it can be formalized as follows. Let x = x{t) be an TV-dimensional input signal where t 
indicates time and x — [xi, xmY" is a vector. Find the input-output function g{x) that generates a scalar 
output signal 

y{t) g{x{t)) (1) 
with most slowly temporal variation possible as measured by the variance of the time derivative: 



mmimize 



(r) (2) 



with (•) indicating the temporal mean. For convenience and to avoid the trivial constant solution the output 
signal has to meet the following constraints: 

(y) = (zero mean) , (3) 
{y^) = 1 (unit variance) . (4) 

This is an optimization problem of variational calculus and as such difficult to solve. However, if we 
constrain the input-output function to be a linear combination of some fixed and possibly nonlinear basis 
functions, the problem becomes tractable and can be solved in the following way. Let h' = h'{x) be a vector 
of some fixed basis functions. To be concrete assume h' contains all monomials of degree 1 and 2. Applying 
h' to the input signal x{t) yields the nonlinearly expanded signal z'{t): 

h'{x) := [xi, X2, xn, xj, XiX2, ■-, x%]'^ , (5) 
z'{t) := h'{x{t)). (6) 

Assume g{x) is a linear combination of the basis functions plus a constant c, generating the output signal 
yit) = gix{t)): 

g{x) := c + a'^h'{x), (7) 
y{t) = c + a'^z'{t). (8) 

Since h' contains all monomials of degree 1 and 2, g{x) can be any polynomial of degree 2, if the constant 
c and the coefficient vector a' are chosen correspondingly. 

For reasons that become clear below, it is convenient to sphere (or whiten) the expanded signal and 
transform the basis functions accordingly: 

h{x) := S {h'{x)-{z')), (9) 
z{t) := S{z'{t)-{z')), (10) 

with the sphering matrix S chosen such that the signal components have a unit covariance matrix, i.e. 
{zz'^) — /, which can be easily done with the help of principal component analysis (PC A). The signal 
components have also zero mean, (z) = 0, since the mean values have been subtracted. 

The input-output function and the output signal can now be written in these transformed basis functions 
and sphered expanded signal, respectively: 

g{x) = a^hix), (11) 
y{t) = a^z{t). (12) 

This is a form in which the constraints JHIEJ are particularly easy to meet, since we find for any coefficient 
vector a with norm 1, 

{y) = (z) ^0, (13) 

=0 

{y^) ^ {zz^)a = a^a = l, (14) 



=j 

which also motivates why constant c has been dropped in pilll2l) . 
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Thus the optimization problem reduces to finding the normahzed coefficient vector a that minimizes 

{f)^a^{zz^)a, (15) 

which is obviously the normalized eigenvector of the time-derivative covariance matrix {zz'^) with the small- 
est eigenvalue, which again can be easily found by PCA. The output signal y{t) is then given by H12|l . Notice 
that y{t) is usually uniquely determined up to the sign. 

In summary SFA consists of basically four steps: (i) expand the input signal with some set of fixed 
possibly nonlinear functions; (ii) sphere the expanded signal to obtain components with zero mean and 
unit covariance matrix; (iii) compute the time derivative of the sphered expanded signal and determine 
the normalized eigenvector of its covariance matrix with the smallest eigenvalue; (iv) project the sphered 
expanded signal onto this eigenvector to obtain the output signal. The input-output function is given by 
Hll|l but is not needed here, since we are only interested in the extracted output signal. 

In practice one has to work with time series Xi instead of continuous signals x{t), but the transfer of the 
algorithm described above to time series is straight forward. The time derivative is simply computed as the 
difference between successive data points without wrap-around assuming a constant sampling spacing At. 

The description of SFA given above assumes that all monomials of degree 1 and 2 are used for signal 
expansion. However, any other set of basis functions could be used as well, e.g. monomials of higher 
degree or radial basis functions. It is also straight forward to extend SFA to more than one output signal 
component, in which case additional components should be uncorrelated to earlier ones. Under this constraint 
additional c omponents can be computed by using the normalized eigenvectors of (|15|l with the next larger 
eigenvalues ijWiSKOTT SE.^NOWSKil2002^ . SFA requires that all components vary to some extent. Thus 
if there are constant components in the input signal or the expanded signal, these are simply discarded. 



3 Examples 

In the following I will present two examples with time series Wi derived from a tent map and a logistic map 
to illustrate the properties of SFA. I have also done simulations with the Lorenz System, but results were 
very unreliable and sensitive to parameter variations and noise. The underlying driving force will always be 
denoted by 7 and may vary between —1 and 1 either smoothly or rarely, but with a comparable time scale 
of variation (as defined by the variance of its time derivative @). 

Taking the time series Wi directly as an input signal would not give SFA enough information to estimate 
the driving force, because SFA considers only data (and its derivative) from one time point at a time. Thus 
it is necessary to generate an embedding- vector time series as an input. Here embedding vectors at time 
point i with delay r and dimension m are defined by 

Xi := [Wi_^(„i_i)/2, Wi— r((m-l)/2-l), Wj+7.(m-l)/2]^ , (16) 

for scalar Wi and odd m. The definition can be easily extended to even m, which requires an extra shift of 
the indices by r/2 or its next lower integer to center the used data points at i. Centering the embedding 
vectors results in an optimal temporal alignment between estimated and true driving force. 

The following simulations are based on 6000 data points each and were done with Matlab (Release 13). 

3.1 Tent Map 

As a first example consider a time series generated by an iterated tent map ( CasdaglI Il997t) 




(17) 

if 0<w»<|(l-7^) 
w^+i = { -2w^-J^ + 2 if i(l-7^) <w^< 1(2-7,) , (18) 

if 1(2-7,) <w»<l 

which maps the interval [0, 1] onto itself with a functional form "/\" for 7 = — 1 (or 7 = 0). The parameter 7 
shifts this wedge cyclically to the left until it becomes a "\/" for 7 = 7 = +1. FigureHshows the true driving 
force, the time series, and the estimated driving force with m — 10, t — 1, and third order polynomials for 
SFA. The correlation between true and estimated driving force is r = 0.96. Note that the scale and offset 
of the estimated driving force are arbitrarily fixed by the constraints and that the sign is random. The 
axes were therefore chosen such that the curves in the bottom graphs of this and the following figures are 
optimally aligned with each other. 
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time series (slow / tent map) 



5 1 - 
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Figure 1: True slowly varying driving force 7^ (bottom, solid line), time series Wi (top) derived from the 
driving force with the tent map, and estimated driving force 7f** (bottom, dots). The correlation between 
true and estimated driving force is r = 0.96. 



3.2 Logistic Map 

As a second example consider a time series derived from a logistic map 

w,+i = i3.6 + 0Aj,)w,{l-w,), (19) 

which maps the interval [0, 1] onto the interval [0, 0.9 + 0.17^] and has the shape of an upside-down parabola 
crossing the abscissa at and 1. Parameter 7 governs the height of the parabola. Figure [3 shows the 
true driving force, the time series, and the estimated driving force with m = 3, r = 1, and second order 
polynomials for SFA. The correlation between true and estimated driving force is r = 0.997. 



4 Technical Remarks 

Choice of Parameters SFA is basically parameter-free except for the general choice of the class of 
nonlinear basis functions. The only other parameters to choose in this method are the dimension m and the 
time delay r of the embedding vectors. The value of (y^) is a fairly reliable indicator for a good choice of 
values for m and r. The smaller (y^) the better, since there is no trivial way of achieving small {y'^) -values 
(except if the number of basis functions comes close to the number of data points). I typically test a certain 
range of r- values and successively increase the m- value until performance (measured in terms of (y^) or r) 
is satisfactory. 

Rarely Varying Driving Forces If SFA is designed to extract slowly varying features from a signal, how 
about driving forces that apparently violate this assumption heavily, e.g. if they vary abruptly and jump 
between different discrete values? The definition of slowness given in ||2Jl does actually not depend on the 
smoothness of the output signal, it may equally well be a signal that switches abruptly between different 
discrete values as long as the jumps occur rarely enough to lead to the same variance of the time derivative. 
Whether the order of the values can be recovered depends on the dynamics of the driving force. SFA will 
tend to order the values such that jumps occur between nearby values of the estimated driving force. Thus, 
if the original values are -0.5, 0, and -1-0.5 and there are many direct jumps between -0.5 and -1-0.5, SFA 
might map the three values onto -1, and 0, respectively, thereby changing the order of the second and 
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Figure 2: True slowly varying driving force 7i (bottom, solid line), time series Wi (top) derived from the 
driving force with the logistic map, and estimated driving force 7^** (bottom, dots). The correlation between 
true and estimated driving force is r = 0.997. 

third value. Figures |31 and 01 show results for the two systems if a rarely instead of a slowly and smoothly 
varying driving force is used. 
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true driv. force (rare) and est. driv. force (rare / tent map / emb., m=1 0,t=1 / SFA ) (r=0.997) 
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Figure 3: True rarely varying driving force 7^ (bottom, solid line), time series Wi (top) derived from the 
driving force with the tent map, and estimated driving force 7f** (bottom, dots). The correlation between 
true and estimated driving force is r = 0.997. 
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Figure 4: True rarely varying driving force 7^ (bottom, solid line), time series Wi (top) derived from the 
driving force with the logistic map, and estimated driving force 7^** (bottom, dots). The correlation between 
true and estimated driving force is r = 0.997. 



High-Dimensional Input Data Probably the most severe limitation of SFA is the fact that the number 
of monomials (or any other basis functions) grows quickly with the dimensionality of the embedding vectors 
(curse of dimensionality). One therefore might run out of computer memory before a high enough m- value 
is reached. However, higher-dimensional problems could be dealt with in a hierarchical fashion by breaking 
the embedding vectors into smaller parts which are first analyzed separately and the results of which are 
then combined for a fi nal analysis. Applying th is hierarchical scheme to 65-dimensional input vectors has 
been demonstrated in ||Wiskott fc SejnowskJ .Eo02'1. 

Accuracy of the Estimated Driving Force It might be surprising that in the examples above SFA is 
able to estimate the driving forces with such an accuracy up to a factor and a constant offset, even though 
the estimation is undetermined up to any invertible transformation, not only scaling and shift. One reason 
for that is that the driving forces used are relatively slow already and could probably not be improved much 
by an invertible nonlinear transformation. However, even if that were not the case one might hope that 
in practice for the lower-dimensional embedding vectors (m = 10 instead of, e.g., m = 50) SFA has to 
find a relatively simple input-output function g{x), which is more likely to preserve the exact shape of the 
driving-force curve than a more complicated one. Further experiments are needed to verify this. 

Noise Sensitivity How sensitive is the method to noise? One might suspect that SFA is quite sensitive 
to noise, since eigenvectors of smallest eigenvalues are used. However, focusing on small eigenvalues is only 
done after sphering and taking the time derivative, so that small but quickly varying noise components 
typically have large eigenvalues and are discarded by SFA. Graceful degradation with noise was also found 
in (WiSKO TT fc Sejnowski, 2002). Thus, using the eigenvectors with smallest eigenvalues in itself does 
not induce noise sensitivity. I have done simulations with the examples presented here by adding Gaussian 
white noise to the signals before embedding. Adding 10%, 20%, and 50% noise to the tent map time series 
reduced the correlation between true and estimated driving force from r — 0.96 to about 0.94, 0.90, and 
0.71, respectively; adding 2%, 5%, and 10% noise to the logistic map time series reduced the correlation 
from r = 0.997 to about 0.97, 0.87, and 0.70, respectively. Thus we see, that in these examples the method 
is in fact fairly robust with respect to noise. 
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Multip le Driving Forces SFA can be easily extended to the extraction of multidimensional output 
signals ((Wiskott fc SejnowskA|2002I) . which could be used to estimate multiple driving forces. However, 
this might require longer time series and higher-dimensional embedding vectors. If the different driving 
forces are not clearly separated by different time scales, SFA might only be able to estimate them up to a 
linear mixing transformation. In that case independent component analysis (ICA) m ight be able to sepa. rate 
them, if they are statistically independent and not more than one is Gaussian (see IHyvarinenI Il999l for 
an overview of ICA methods). 

5 Conclusion 

In this paper I have demonstrated that slow feature analysis (SFA) can be applied to the problem of esti- 
mating a driving force of a nonstationary time series. The estimates are fairly accurate except for a scaling 
factor and a constant offset, which cannot be extracted in general. The method works for slowly as well as 
rarely varying driving forces, although in the latter case the order of the different discrete values might not 
be recoverable. The slowness principle also provides robustness to noise, which is typically quickly varying 
and therefore suppressed by SFA as far as possible. 

There are still a number of open questions. The next steps of investigation will have to include a 
comparison with standard methods, such as the technique of recurrence plots, and an exploration of the 
conditions under which SFA can be applied with similar success as demonstrated here. It is also necessary 
to apply SFA to real world data. This has been successfully doi ie in the context of learning receptive field 
properties of the visual cortex based on natural image sequences ()Berkes fc WiskottI 12003(1 . but that was 
not for extracting driving forces. In any case, since SFA works very differently from other techniques that 
have been applied to the estimation of driving forces, one can hope that it at least complements these other 
techniques in some cases. 
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